A Comparison of Prediction Accuracy, Complexity, and Training Time of Thirty-three Old and New Classification Algorithms

Author

  • William W. Cohen
Abstract

Twenty-two decision tree, nine statistical, and two neural network algorithms are compared on thirty-two datasets in terms of classification accuracy, training time, and (in the case of trees) number of leaves. Classification accuracy is measured by mean error rate and mean rank of error rate. Both criteria place a statistical, spline-based algorithm called Polyclass at the top, although it is not statistically significantly different from twenty other algorithms. Another statistical algorithm, logistic regression, is second with respect to the two accuracy criteria. The most accurate decision tree algorithm is Quest with linear splits, which ranks fourth and fifth, respectively. Although spline-based statistical algorithms tend to have good accuracy, they also require relatively long training times. Polyclass, for example, is third last in terms of median training time. It often requires hours of training compared to seconds for other algorithms. The Quest and logistic regression algorithms are substantially faster. Among decision tree algorithms with univariate splits, C4.5, Ind-Cart, and Quest have the best combinations of error rate and speed. But C4.5 tends to produce trees with twice as many leaves as those from Ind-Cart and Quest.
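The two accuracy criteria named in the abstract, mean error rate and mean rank of error rate, can be illustrated with a minimal sketch. The abstract does not spell out the exact ranking procedure, so this sketch assumes that algorithms are ranked within each dataset (rank 1 for the lowest error) and that ties share the average rank. The function names `rank_with_ties` and `summarize`, the algorithm labels, and the numbers are all hypothetical and are not taken from the study.

```python
from statistics import mean


def rank_with_ties(values):
    """Return 1-based ranks (lowest value gets rank 1); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of entries tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def summarize(error_rates):
    """Map each algorithm to (mean error rate, mean rank of error rate).

    error_rates maps an algorithm name to a list of error rates,
    one per dataset, with datasets in the same order for every algorithm.
    """
    names = list(error_rates)
    n_datasets = len(error_rates[names[0]])
    per_dataset_ranks = {name: [] for name in names}
    for d in range(n_datasets):
        ranks = rank_with_ties([error_rates[name][d] for name in names])
        for name, r in zip(names, ranks):
            per_dataset_ranks[name].append(r)
    return {
        name: (mean(error_rates[name]), mean(per_dataset_ranks[name]))
        for name in names
    }


# Made-up error rates for three hypothetical algorithms on three datasets.
toy = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.15, 0.10, 0.30],
    "C": [0.40, 0.30, 0.05],
}
summary = summarize(toy)
for name, (err, rank) in summary.items():
    print(f"{name}: mean error = {err:.3f}, mean rank = {rank:.3f}")
```

The two criteria need not agree: in the toy data, A and B have the same mean rank while B has the lower mean error rate, which is one reason to report both.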


Similar resources

Comparison of Artificial Neural Network Training Algorithms for Predicting the Weight of Kurdi Sheep using Image Processing

Extended Abstract. Introduction and Objective: Because of human weakness, unwanted errors, environmental influences, and exposure to natural events, humans often make mistakes in their assessments of the environment or of other topics, so that different people's perceptions of a single event may differ widely. Nowadays, with the development of image proc...


Low Complexity Speaker Authentication Techniques Using Polynomial Classifiers

Modern authentication systems require high-accuracy low complexity methods. High accuracy ensures secure access to sensitive data. Low computational requirements produce high transaction rates for large authentication populations. We propose a polynomial-based classification system that combines high-accuracy and low complexity using discriminative techniques. Traditionally polynomial classifiers...


Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)

Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems in order to manage and minimize the inherent risks of the financial sector, thus, the design and improvement of credit scorin...


Fast Intra Mode Decision for Depth Map Coding in 3D-HEVC Standard

Three-dimensional high efficiency video coding (3D-HEVC) is the expanded version of the latest video compression standard, namely high efficiency video coding (HEVC), and is used to compress 3D videos. 3D videos include texture video and depth maps. Since the statistical characteristics of depth maps are different from those of texture videos, new tools have been added to the HEVC standard fo...


A New High-order Takagi-Sugeno Fuzzy Model Based on Deformed Linear Models

Amongst possible choices for identifying complicated processes for prediction, simulation, and approximation applications, high-order Takagi-Sugeno (TS) fuzzy models are fitting tools. Although they can construct models with rather high complexity, they are not as interpretable as first-order TS fuzzy models. In this paper, we first propose to use Deformed Linear Models (DLMs) in consequence pa...




Publication date: 1999